 order flow


ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book

arXiv.org Artificial Intelligence

We present ABIDES-MARL, a framework that combines a new multi-agent reinforcement learning (MARL) methodology with a new realistic limit-order-book (LOB) simulation system to study equilibrium behavior in complex financial market games. The system extends ABIDES-Gym by decoupling state collection from kernel interruption, enabling synchronized learning and decision-making for multiple adaptive agents while maintaining compatibility with standard RL libraries. It preserves key market features such as price-time priority and discrete tick sizes. Methodologically, we use MARL to approximate equilibrium-like behavior in multi-period trading games with a finite number of heterogeneous agents (an informed trader, a liquidity trader, noise traders, and competing market makers), all with individual price impacts. This setting bridges optimal execution and market microstructure by embedding the liquidity trader's optimization problem within a strategic trading environment. We validate the approach by solving an extended Kyle model within the simulation system, recovering the gradual price discovery phenomenon. We then extend the analysis to a liquidity trader's problem where market liquidity arises endogenously and show that, at equilibrium, execution strategies shape market-maker behavior and price dynamics. ABIDES-MARL provides a reproducible foundation for analyzing equilibrium and strategic adaptation in realistic markets and contributes toward building economically interpretable agentic AI systems for finance.
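The two market features named in this abstract, price-time priority and discrete tick sizes, can be illustrated with a minimal sketch. This is not ABIDES-MARL code: the `Order` class, the helper names, and the tick size of 0.01 are all assumptions made for illustration. Prices are stored as integer tick counts, and resting bids are ranked by price first and arrival order second.

```python
from dataclasses import dataclass
from itertools import count

TICK = 0.01      # assumed tick size, for illustration only
_seq = count()   # arrival counter used as the time-priority key


@dataclass
class Order:
    trader: str
    price_ticks: int  # price as an integer number of ticks
    qty: int
    seq: int          # arrival sequence number


def to_ticks(price: float) -> int:
    """Snap a price to the tick grid (nearest tick)."""
    return round(price / TICK)


def add_bid(book: list, trader: str, price: float, qty: int) -> None:
    """Insert a resting buy order and keep the book in priority order."""
    book.append(Order(trader, to_ticks(price), qty, next(_seq)))
    # Higher price first; among equal prices, earlier arrival first.
    book.sort(key=lambda o: (-o.price_ticks, o.seq))


def match_sell(book: list, qty: int) -> list:
    """Fill an incoming market sell against resting bids.

    Returns a list of (trader, price_ticks, filled_qty) tuples.
    """
    fills = []
    while qty > 0 and book:
        best = book[0]
        traded = min(qty, best.qty)
        fills.append((best.trader, best.price_ticks, traded))
        best.qty -= traded
        qty -= traded
        if best.qty == 0:
            book.pop(0)
    return fills


bids = []
add_bid(bids, "mm1", 100.00, 50)
add_bid(bids, "mm2", 100.01, 30)
add_bid(bids, "mm3", 100.01, 40)  # same price as mm2, but arrives later
fills = match_sell(bids, 60)
```

In this toy run the better-priced bids are filled before `mm1`, and `mm2` is filled before `mm3` because it arrived first at the same price.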


Multi-Agent Reinforcement Learning for Market Making: Competition without Collusion

arXiv.org Artificial Intelligence

Algorithmic collusion has emerged as a central question in AI: Will the interaction between different AI agents deployed in markets lead to collusion? More generally, understanding how emergent behavior, be it a cartel or market dominance from more advanced bots, affects the market overall is an important research question. We propose a hierarchical multi-agent reinforcement learning framework to study algorithmic collusion in market making. The framework includes a self-interested market maker (Agent A), which is trained in an uncertain environment shaped by an adversary, and three bottom-layer competitors: the self-interested Agent B1 (whose objective is to maximize its own PnL), the competitive Agent B2 (whose objective is to minimize the PnL of its opponent), and the hybrid Agent B$^\star$, which can modulate between the behavior of the other two. To analyze how these agents shape the behavior of each other and affect market outcomes, we propose interaction-level metrics that quantify behavioral asymmetry and system-level dynamics, while providing signals potentially indicative of emergent interaction patterns. Experimental results show that Agent B2 secures dominant performance in a zero-sum setting against B1, aggressively capturing order flow while tightening average spreads, thus improving market execution efficiency. In contrast, Agent B$^\star$ exhibits a self-interested inclination when co-existing with other profit-seeking agents, securing dominant market share through adaptive quoting, yet exerting a milder adverse impact on the rewards of Agents A and B1 compared to B2. These findings suggest that adaptive incentive control supports more sustainable strategic co-existence in heterogeneous agent environments and offers a structured lens for evaluating behavioral design in algorithmic trading systems.


TABL-ABM: A Hybrid Framework for Synthetic LOB Generation

arXiv.org Artificial Intelligence

The recent application of deep learning models to financial trading has heightened the need for high-fidelity financial time series data. Such synthetic data can be used to supplement historical data to train large trading models. The state-of-the-art models for the generative application often rely on huge amounts of historical data and large, complicated models. These models range from autoregressive and diffusion-based models through to architecturally simpler models such as the temporal-attention bilinear layer. Agent-based approaches to modelling limit order book dynamics can also recreate trading activity through mechanistic models of trader behaviours. In this work, we demonstrate how a popular agent-based framework for simulating intraday trading activity, the Chiarella model, can be combined with one of the most performant deep learning models for forecasting multivariate time series, the TABL model. This forecasting model is coupled to a simulation of a matching engine with a novel method for simulating deleted order flow. Our simulator gives us the ability to test the generative abilities of the forecasting model using stylised facts. Our results show that this methodology generates realistic price dynamics. However, a deeper analysis shows that parts of the market's microstructure are not accurately recreated, highlighting the necessity of including more sophisticated agent behaviors in the modeling framework to help account for tail events.


Scaling Law for Large-Scale Pre-Training Using Chaotic Time Series and Predictability in Financial Time Series

arXiv.org Artificial Intelligence

Time series forecasting plays a critical role in decision-making processes across diverse fields, including meteorology, traffic, electricity, economics, and finance. Predicting returns on financial instruments is an especially challenging problem. Some researchers have proposed time series foundation models applicable to various forecasting tasks. Simultaneously, based on the recognition that real-world time series exhibit chaotic properties, methods have been developed to artificially generate synthetic chaotic time series, construct diverse datasets, and train models. In this study, we propose a methodology for modeling financial time series by generating artificial chaotic time series and applying resampling techniques to simulate financial time series data, which we then use as training samples. Increasing the resampling interval to extend predictive horizons, we conducted large-scale pre-training using 10 billion training samples for each case. We subsequently created test datasets for multiple timeframes using actual Bitcoin trade data and performed zero-shot prediction without re-training the pre-trained model. Evaluating the profitability of a simple trading strategy based on these predictions showed significant performance improvements over autocorrelation models. During the large-scale pre-training process, we observed a scaling-law-like phenomenon: a given level of predictive performance can be achieved at extended predictive horizons for chaotic time series by increasing the number of training samples exponentially. If this scaling law proves robust and holds true across various chaotic models, it suggests the potential to predict near-future events by investing substantial computational resources. Future research should focus on further large-scale training and verifying the applicability of this scaling law to diverse chaotic models.
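As a rough illustration of the data pipeline this abstract describes (generate a chaotic series, resample it at a wider interval, and cut it into training samples), here is a minimal sketch. The abstract does not say which chaotic system or resampling scheme was used, so the logistic map, the stride-based resampling, and the window length are all assumptions chosen for brevity.

```python
def logistic_series(n: int, x0: float = 0.2, r: float = 4.0) -> list:
    """Generate n points of the logistic map x -> r * x * (1 - x).

    The logistic map at r = 4 is a standard chaotic system; it is used
    here only as a stand-in, not as the model from the paper.
    """
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def resample(series: list, interval: int) -> list:
    """Keep every `interval`-th point; a larger interval means a longer
    effective prediction horizon per step."""
    return series[::interval]


def make_samples(series: list, window: int) -> list:
    """Build (context window, next value) training pairs."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


raw = logistic_series(1000)
coarse = resample(raw, 5)          # hypothetical resampling interval
samples = make_samples(coarse, 8)  # hypothetical context length
```

Raising the `interval` argument makes each one-step prediction span more of the underlying dynamics, which is the sense in which a wider resampling interval extends the predictive horizon.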


Why is the estimation of metaorder impact with public market data so challenging?

arXiv.org Artificial Intelligence

Transaction cost analysis is a fundamental aspect of financial trading, and market impact is the main source of costs for medium- and large-sized investors [1]. Thus, estimating the potential impact and cost of a trading decision is important for assessing its profitability. This is particularly true, and particularly challenging, for metaorders, i.e. sequences of orders and trades executed gradually over a long time period and following a single investment decision. In fact, while there is a vast literature on estimating and modeling the impact of individual trades (or orders) from public data, it is less clear if and how such models can be used to estimate the expected price trajectory of a metaorder and the associated impact cost. To this end, the industrial practice is to estimate market impact and the associated cost of a metaorder by using data on actual metaorder execution (for academic research using this approach, see, for example, [2-5]). However, this approach presents some pitfalls.


Many learning agents interacting with an agent-based market model

arXiv.org Artificial Intelligence

We consider the dynamics and the interactions of multiple reinforcement learning optimal execution trading agents interacting with a reactive Agent-Based Model (ABM) of a financial market in event time. The model represents a market ecology with three trophic levels: optimal execution learning agents, minimally intelligent liquidity takers, and fast electronic liquidity providers. The optimal execution agent classes include buying and selling agents that can either use a combination of limit orders and market orders, or only trade using market orders. The reward function explicitly balances trade execution slippage against the penalty of not executing the order timeously. This work demonstrates how multiple competing learning agents impact a minimally intelligent market simulation as functions of the number of agents, the size of agents' initial orders, and the state spaces used for learning. We use phase space plots to examine the dynamics of the ABM when various specifications of learning agents are included. Further, we examine whether the inclusion of optimal execution agents that can learn is able to produce dynamics with the same complexity as empirical data. We find that the inclusion of optimal execution agents changes the stylised facts produced by the ABM to conform more closely with empirical data, and that such agents are a necessary inclusion for ABMs investigating market microstructure. However, adding execution agents to chartist-fundamentalist-noise ABMs is insufficient to recover the complexity observed in empirical data.
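The reward trade-off mentioned in this abstract, slippage against a penalty for unexecuted volume, can be sketched as follows. The abstract does not give the functional form, so the linear penalty, the parameter names, and the use of the arrival price as the slippage benchmark are assumptions made for illustration.

```python
def execution_reward(
    side: str,
    arrival_price: float,
    fills: list,
    remaining_qty: int,
    penalty_per_share: float,
) -> float:
    """Reward for an execution agent (hypothetical form).

    fills is a list of (price, qty) pairs. Slippage is measured against
    the arrival price: paying more than it (buy) or receiving less than
    it (sell) counts as a cost. Shares left unexecuted at the end of
    the episode incur a linear penalty.
    """
    sign = 1.0 if side == "buy" else -1.0
    slippage = sum(sign * (price - arrival_price) * qty for price, qty in fills)
    penalty = penalty_per_share * remaining_qty
    return -(slippage + penalty)


# A buy order that paid above the arrival price and left 5 shares unfilled.
reward_buy = execution_reward("buy", 100.0, [(101.0, 10), (102.0, 5)], 5, 0.5)

# A fully executed sell order that sold below the arrival price.
reward_sell = execution_reward("sell", 100.0, [(99.0, 10)], 0, 0.5)
```

A larger `penalty_per_share` pushes the agent toward aggressive execution, while a smaller one lets it wait for better prices at the risk of finishing with inventory.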


Short-Term Volatility Prediction Using Deep CNNs Trained on Order Flow

arXiv.org Artificial Intelligence

As a newly emerged asset class, cryptocurrency is evidently more volatile than traditional equity markets. Because crypto assets are mostly unregulated and often illiquid, their prices can change significantly within minutes, which in turn might result in considerable losses. In this paper, we employ an approach for encoding market information into images and predicting short-term realized volatility with Convolutional Neural Networks. We then compare the performance of the proposed encoding and corresponding model with other benchmark models. The experimental results demonstrate that this representation of market data, with a Convolutional Neural Network as the predictive model, has the potential to better capture market dynamics and to yield better volatility predictions.


Deep Recurrent Modelling of Stationary Bitcoin Price Formation Using the Order Flow

arXiv.org Machine Learning

In this paper we propose a deep recurrent model based on the order flow for the stationary modelling of high-frequency directional price movements. The order flow is the microsecond stream of orders arriving at the exchange, driving the formation of prices seen on the price chart of a stock or currency. To test the stationarity of the proposed model, we train it on data from before the 2017 Bitcoin bubble period and test it during and after the bubble. We show that without any retraining, the proposed model is temporally stable even as Bitcoin trading shifts into an extremely volatile "bubble trouble" period. The significance of the result is shown by benchmarking against existing state-of-the-art models in the literature for modelling price formation using deep learning.


Universal features of price formation in financial markets: perspectives from Deep Learning

arXiv.org Machine Learning

Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price. We assess the model by testing its out-of-sample predictions for the direction of price moves given the history of price and order flow, across a wide range of stocks and time periods. The universal price formation model exhibits a remarkably stable out-of-sample prediction accuracy across time, for a wide range of stocks from different sectors. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific. The universal model, trained on data from all stocks, outperforms, in terms of out-of-sample prediction accuracy, asset-specific linear and nonlinear models trained on time series of any given stock, showing that the universal nature of price formation weighs in favour of pooling together financial data from various stocks, rather than designing asset- or sector-specific models as commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations improves forecasting performance, showing evidence of path-dependence in price dynamics. The authors thank seminar participants at the London Quant Summit 2018, JP Morgan and Princeton University for their comments.
Computations for this paper were performed using a grant from the CFM-Imperial Institute of Quantitative Finance and the Blue Waters supercomputer grant "Distributed Learning with Neural Networks". This data may be put to use to explore the nature of the price formation mechanism which describes how market prices react to fluctuations in supply and demand. At a high level, a 'price formation mechanism' is a map which represents the correspondence between the market price and variables such as price history and order flow: $\text{Price}(t + \Delta t) = F(\text{Price history}(0 \ldots t), \text{Order Flow}(0 \ldots t), \text{Other Information}) = F(X_t, \epsilon_t)$, where $X_t$ is a set of state variables (e.g., lagged values of price, volatility, and order flow), endowed with some dynamics, and $\epsilon_t$ is a random 'noise' or innovation term representing the arrival of new information and other effects not captured entirely by the state variables.


Deep Learning Can Read The Tea Leaves In Market Data

International Business Times

Henri Waelbroeck, director of research at machine learning trade execution system Portware, says rather poetically that the system "reads the tea leaves" in market data to distinguish different sorts of orders and execute trades more efficiently. Portware uses artificial intelligence to help traders select the best algorithm for particular market conditions, asset class, broker, venue etc., interacting with the order flow and computing a mind-boggling array of variables in real time. Say you are buying a stock and you predict that more orders are likely to hit the bid side of the spread in the next five minutes. In that case, you should be able to operate an efficient algorithm that only posts limit orders and collects the spread as it executes. Using an algorithm that crosses the spread in this instance would be wasteful, since you expect order flow to be coming your way. Waelbroeck, formerly a professor at the Institute of Nuclear Sciences at the National University of Mexico, whose specialisms include genetic algorithms and chaos theory, said: "Just throwing machine learning at problems usually doesn't give a very good answer. You need to have a good analytical understanding of what's going on and this usually gives you a baseline model and then you find opportunities to insert machine learning tactically to exploit opportunities to improve the models."
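The buy-side example in the article, where you post passively when flow is expected to come your way and cross the spread otherwise, can be written as a toy decision rule. This is only an illustrative sketch and not Portware's method: the function name, the flow forecasts, and the threshold are all hypothetical.

```python
def choose_order_style(
    side: str,
    predicted_sell_flow: float,
    predicted_buy_flow: float,
    threshold: float = 0.0,
) -> str:
    """Pick a passive or aggressive execution style (hypothetical rule).

    A buyer benefits when sellers are expected to hit the bid, and a
    seller benefits when buyers are expected to lift the offer. If the
    expected favourable flow exceeds the threshold, post limit orders
    and collect the spread; otherwise cross the spread.
    """
    if side == "buy":
        favourable = predicted_sell_flow - predicted_buy_flow
    else:
        favourable = predicted_buy_flow - predicted_sell_flow
    return "post_limit" if favourable > threshold else "cross_spread"


# Buyer expecting more sell orders to hit the bid: stay passive.
style_passive = choose_order_style("buy", predicted_sell_flow=800, predicted_buy_flow=300)

# Buyer expecting flow to move away: pay the spread to get filled.
style_aggressive = choose_order_style("buy", predicted_sell_flow=200, predicted_buy_flow=900)
```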